ABSTRACT

We present an analysis of the X-ray properties of sources detected in the $13^H$ XMM-Newton
deep (200ks) field. In order to constrain the absorbed AGN population, we use extensive 
Monte Carlo simulations to directly compare the X-ray colours of observed sources with those 
predicted by several model distributions. In particular, we have carried out our comparisons 
over the entire 0.2-10 keV energy range of the XMM-Newton cameras, making our analysis 
sensitive to a large range of absorbing column densities. We have tested the simplest form 
of the unified scheme, whereby the intrinsic luminosity function of absorbed AGN is set to 
be the same as that of their unabsorbed brethren, coupled with various model distributions of 
absorption. Of the tested distributions, the best fitting model has the fraction of AGN with 
absorbing column $N_H$ proportional to $(\log N_H)^8$. We have also tested two extensions to the
unified scheme: an evolving absorption scenario, in which the fraction of absorbed sources 
is larger at higher redshifts, and a luminosity dependent model in which high luminosity 
AGN are less likely to be absorbed. Both of these models provide poorer matches to the 
observed X-ray colour distributions than the best fitting simple unified model. We find that 
a luminosity dependent density evolution luminosity function reproduces poorly the 0.5-2 
keV source counts seen in the $13^H$ field. Field to field variations could be the cause of this
disparity. Computing the X-ray colours with a simple absorbed power-law spectral model is 
found to over-predict, by a factor of two, the fraction of hard sources that are completely 
absorbed below 0.5 keV, implying that an additional source of soft-band flux must be present 
in a number of the absorbed sources. The tested synthesis models predict that around 16% of 
the detections in the $13^H$ field are due to AGN at z > 3. However, so far, only a single AGN
with z > 3 has been identified in our approximately 50% complete optical spectroscopy follow 
up program. Finally, we use our simulations to demonstrate the efficacy of a hardness ratio 
selection scheme at selecting absorbed sources for further study. Using this selection scheme, 
we show that around 40% of the $13^H$ sample are expected to be AGN with $N_H > 10^{22}$ cm$^{-2}$.
1 INTRODUCTION

Deep X-ray surveys have progressively resolved an increasing
fraction of the soft X-ray background (XRB) into faint
point-like sources. Most recently, the ultra-deep Chandra surveys
(Rosati et al. 2002; Brandt et al. 2001) have resolved over
90% of the 0.5-2 keV XRB. Source counts have been measured
in these fields to limiting fluxes of a few times $10^{-17}$ erg
s$^{-1}$ cm$^{-2}$, with corresponding sky densities of over $10^4$ sources
deg$^{-2}$ (Alexander et al. 2003). Optical identification of faint X-ray
sources reveals a heterogeneous mixture of objects, with the dominant
class being active galactic nuclei (AGN) (Page et al. 2003;
Barger et al. 2003). These X-ray selected AGN have a range of
luminosities spanning several orders of magnitude, are found at
redshifts up to 5, and exhibit a wide range of observational properties.
The unification scheme of Antonucci (1993) attributes the wide
variety of radio, optical and X-ray characteristics seen between AGN
classes to the geometry and relative orientation of a dusty torus
surrounding the central black hole. The dusty torus model is used to
explain both the absence of broadened optical lines, and the strong
X-ray absorption seen in many AGN. The simplest version of the
unified scheme is one in which the geometry of the inner regions,
and hence the distribution of absorbing column densities ($N_H$), is
independent of all other AGN properties. A number of refinements
to this model have been suggested, in order to explain recent
observations which are at odds with the simple unified scheme.

* E-mail: td@mssl.ucl.ac.uk

The demographics of the 0.5-2 keV AGN population have 
been well measured, albeit to relatively bright flux limits, using
ROSAT/Einstein samples, for example by Maccacaro et al. (1991),
Boyle et al. (1994), Page et al. (1997) and Miyaji et al. (2000).
Each of these studies indicates strong evolution of the AGN X- 
ray luminosity function (XLF), with a peak in AGN activity at 
1 < z < 2, although the best description of this evolution is dis- 
puted. The soft XLF is characterised by a double power-law with a 
knee at $L_{0.5-2} \sim 10^{44}$ erg s$^{-1}$. The bulk of the XRB is the result of
the integrated emission of AGN having luminosities in the vicinity 
of this knee. 

The spectral slope of the 0.1-1 keV XRB is $\Gamma \sim 2.0$
(Miyaji et al. 1998), approximately matching that of a typical soft
band detected AGN, $\Gamma = 1.9$ (Page et al. 2003; Piconcelli et al.
2003; Mateos et al. 2005). However, at harder X-ray energies the
XRB flattens dramatically, and has $\Gamma \sim 1.4$ in the 1-10 keV band
(e.g. Miyaji et al. 1998; Lumb et al. 2002; De Luca & Molendi
2004), and cannot be produced by a simple superposition of canonical
$\Gamma = 1.9$ AGN spectra. Clearly, some additional sources of hard
X-rays must exist, and a large population of heavily absorbed AGN
is postulated to fill this role. Population synthesis models have been
formulated, such as those of Madau et al. (1994), Comastri et al.
(1995) and Gilli et al. (2001), and have been successful in reproducing
both the broad band spectrum of the XRB and the
AGN source counts observed below 10 keV. Gilli et al. (2001) predict
that the majority (> 80%) of the AGN population is heavily
absorbed.

A number of deep surveys of the 2-10 keV sky have been performed
with Chandra and XMM-Newton (e.g. Brandt et al. 2001;
Giacconi et al. 2002; Hasinger et al. 2001). In the 2 Ms observations
of the Chandra Deep Field-North (CDF-North), sources have been
detected to a limiting flux of $\sim 1.4 \times 10^{-16}$ erg s$^{-1}$ cm$^{-2}$ in the
2-10 keV band (Alexander et al. 2003). The 1 Ms observations of the
Chandra Deep Field-South (CDF-South) are estimated to have resolved
more than 85% of the 2-10 keV XRB (Rosati et al. 2002).
However, due to the relatively large uncertainty (~20%) in the
total intensity of the 2-10 keV cosmic XRB, the precise resolved
fraction is still somewhat uncertain (De Luca & Molendi 2004). It
has also been shown that extrapolating the source counts seen in
the Chandra deep fields to much lower fluxes does not fully
reproduce the total level of the cosmic XRB, suggesting the existence of
an additional very faint X-ray population (e.g. Moretti et al. 2003;
De Luca & Molendi 2004).

The luminosity function of AGN that are selected in the 2-10
keV band has been measured by several studies. Cowie et al. (2003)
demonstrated that strong evolution of the hard XLF has occurred
between the z = 2-4 and z = 0.1-1 epochs. Ueda et al. (2003)
used a sample of 247 AGN, including some from the CDF-North,
to show that the XLF is best represented by a complex luminosity
dependent density evolution (LDDE) model. It should be noted that
this sample does not reach to the limiting flux of the CDF-N data,
but to $S_{2-10} \sim 4 \times 10^{-15}$ erg s$^{-1}$ cm$^{-2}$, where the optical
identifications are reasonably complete.

The major stumbling block in understanding the nature of
the faint, hard X-ray selected AGN population is the difficulty
of obtaining optical spectroscopic identifications. The soft X-ray
selected samples used for XLF determinations, for example
Page et al. (1997) and Miyaji et al. (2000), are primarily, or wholly,
composed of bright (R < 22), broad-line AGN counterparts, which
are relatively easy to optically identify. At fainter X-ray fluxes, such
as those probed in the CDF-North, AGN without broad lines,
together with normal galaxies, make up a large fraction of the
identified objects (Barger et al. 2003). Those non-broad-line AGN
having spectroscopic identifications are almost all found with z < 1,
in contrast to the peak of the broad-line sample, which lies at
1 < z < 2. The large numbers of z > 1 type-II quasars (having
$\log L_X > 44$, $\log N_H > 22$) predicted by synthesis models have
not been detected in these surveys. The obvious conclusion to be
drawn is that the absorbed and unabsorbed AGN are taken from
separate populations, a direct contradiction of the simplest unified
scheme. However, the optical follow up programs in these deep
Chandra fields are by no means complete to the faint X-ray limit.
For example, in the CDF-North, only 55% of the X-ray detections
have optical counterparts with R < 24 (Barger et al. 2003). Objects
fainter than this limit are practically unidentifiable with current
optical spectroscopic techniques. There is a wide range of X-ray to
optical flux ratios, $f_X/f_{opt}$, in these samples, and so the
unidentifiable objects are not necessarily the faintest X-ray sources, and
produce a significant fraction of the XRB. The nature of these
optically faint, hard X-ray objects is still not well understood. They
could be narrow line AGN at z > 1, with their strongest emission
lines shifted out of the optical band. A number of optically-faint
Chandra sources have been identified from their near IR properties
to be AGN located in luminous, evolved host galaxies at high
redshifts (e.g. Cowie et al. 2001). Alternatively, the unidentifiable
sources may be AGN embedded in optically thick dusty galaxies at
moderate redshifts, the faintness of the hosts precluding
identification (Fabian et al. 1998; Severgnini et al. 2003).

Optical studies of nearby Seyfert galaxies have found that
the ratio, R, of type 2 to type 1 Seyferts is approximately 4
(Maiolino & Rieke 1995). The hard X-ray study of these type 2
Seyferts by Risaliti et al. (1999) discovered a wide distribution of
absorbing columns, but with ~75% of the AGN having $N_H > 10^{23}$
cm$^{-2}$. However, this study was limited to the very local universe,
$\langle d \rangle = 24$ Mpc, and to low nuclear luminosities, $M_B > -22$; the
behaviour in the rest of the redshift-luminosity plane is less well
understood. The distribution of absorption in X-ray selected AGN
is poorly constrained, the prime difficulty being that the greater
an object's $N_H$, the lower is its chance of being detected, or
optically identified. There have been several published cases of AGN
in which the absorbing columns inferred by optical and X-ray
measurements differ significantly (e.g. Maiolino et al. 2001; Page et al.
2001; Loaring, Page & Ramsay 2003). Despite these limitations,
the $N_H$ distribution for hard X-ray selected AGN has been
estimated, for relatively bright samples, by Ueda et al. (2003). These
authors have primarily determined $N_H$ in their optically identified
sample by examination of X-ray hardness ratio between the 0.5-2
and 2-10 keV bands. The distribution of absorption within the
sample is described with a luminosity dependent $N_H$ model, in which
high intrinsic luminosity AGN are less likely to be heavily
absorbed. This model does require some additional Compton thick
AGN to reproduce fully the XRB when extrapolated to harder
energies.

So, despite the progress made in resolving, and to some ex- 
tent optically identifying, the hard X-ray population, it has still not 
been possible to delimit the distribution of absorption in AGN. This 
problem is particularly acute for the heavily absorbed, high-z AGN; 
few of which have been detected and identified. However, by better
constraining the $f(N_H)$ in faint AGN, we can hope to answer
many questions about the geometry, composition and evolution of
the dusty torus. For example, the strength of the luminosity dependence
of $f(N_H)$ can tell us about how the radiation from the accretion
disk influences the surrounding torus, and/or how the torus
geometry scales with black hole mass. If some redshift evolution of
$f(N_H)$ is detected, is it related to the overall evolution of the AGN
luminosity function? Does the dusty torus form coevally with the 
black hole, and is the amount of absorbing material related to the 
black hole mass? 

In this study we use X-ray hardness ratios as an indicator of 
absorption in the spectra of the sources in our sample. Many authors
(e.g. Mainieri et al. 2002; Della Ceca et al. 2004; Perola et al.
2004) have shown that colour based analyses are effective in deriving
the properties of XMM-Newton sources which are detected
with too few counts to permit full spectral fitting. In these optically 
identified samples, the AGN with and without broad emission lines 
are seen to occupy separate regions in X-ray colour-colour plots. 
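
As a minimal sketch of such a colour, the snippet below computes a hardness ratio from the counts measured in a soft and a hard band. The form HR = (H - S)/(H + S) and the first-order Poisson error propagation are conventional choices assumed here for illustration; they are not taken from this section, and the function name is hypothetical.

```python
import math


def hardness_ratio(soft, hard):
    """Return (HR, sigma_HR) for soft- and hard-band counts.

    Assumes HR = (H - S) / (H + S) and Poisson errors sqrt(counts)
    propagated to first order; both are illustrative conventions.
    """
    total = soft + hard
    if total <= 0:
        raise ValueError("need at least one count in the two bands")
    hr = (hard - soft) / total
    # d(HR)/dH = 2S/T^2 and d(HR)/dS = -2H/T^2, with var(H) = H, var(S) = S.
    sigma = 2.0 * math.sqrt(hard * soft**2 + soft * hard**2) / total**2
    return hr, sigma


if __name__ == "__main__":
    print(hardness_ratio(30, 10))
```

A source with few counts still yields a usable HR with a large error bar, which is why colours remain informative where full spectral fitting is not possible.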

We present in this paper an analysis of the X-ray properties
of sources detected by XMM-Newton in the $13^H$ deep field. In
section 3 we describe the XMM-Newton observations. In order to
escape the possible biases introduced by the incompleteness of
optical identification programmes, we have devised a method to probe
the $f(N_H)$ of our sample. Our technique does not depend on
optical identification of the sample, permitting the inclusion of the
optically faint X-ray detections. The simulations use a model XLF
to describe the intrinsic distribution of all AGN in redshift and
(de-absorbed) luminosity space; this is coupled to a model $N_H$ function,
to generate a synthetic AGN population (described in section 4).
We simulate how this model population would be seen with
XMM-Newton, accounting for both the selection function caused by the
complex EPIC detector imaging characteristics, and the nuances of
the source detection process (see section 5). The output products
of the simulation allow direct comparison of each of the model $N_H$
distributions with the $13^H$ sample. We then compare the predictions
of several simple unified scheme models of the $N_H$ distribution,
by using a statistical comparison of the X-ray colour distributions
found in the data and models (section 6). Furthermore, we test
two examples of more complex $N_H$ distribution models taken from
the literature, and compare them to the $13^H$ sample. In section 6.2
we compare the source counts found in the $13^H$ field and those
predicted by the model simulations. Finally, in section 8 we discuss
our results and their implications for AGN torus models, and for
the evolving XLF model of Miyaji et al. (2000).

Throughout the paper we use a lambda-dominated flat cosmology
with $H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$, $(\Omega_m, \Omega_\Lambda) = (0.3, 0.7)$. $L_{E_{min}-E_{max}}$
refers to an object's de-absorbed X-ray luminosity in the observed
$E_{min}$-$E_{max}$ band. $N_H$ is the equivalent hydrogen column density in
units of cm$^{-2}$. We refer to the $N_H$ distribution function as $f(N_H)$,
and define it to be the fraction of all AGN, per unit $\log N_H$, which
have absorbing column $N_H$. We define a power-law spectrum to be
$F \propto E^{-\Gamma}$, where $F$ is the flux in units of photons keV$^{-1}$ s$^{-1}$ cm$^{-2}$, $E$
is the photon energy in keV, and $\Gamma$ is the photon index.



2 MODELS OF THE DISTRIBUTION OF ABSORPTION 
IN AGN 

The unified model attributes the X-ray absorption seen in AGN 
to a dusty torus surrounding the central supermassive black hole
(Antonucci 1993). For certain orientations, the torus obscures the
observer's line of sight to the X-ray/UV emitting accretion disk. 
There have been a wide range of absorbing columns inferred from 
the X-ray spectra of various AGN, ranging from effectively zero
absorption, to column densities over $10^{25}$ cm$^{-2}$. The unified scheme
states that all AGN are intrinsically similar, and that the observa- 
tional differences between the various AGN types are due to the 
orientation of the observer. Therefore, it is the geometry of the dust 
torus which determines the amount of obscuring material along the 
observer's line of sight to the central X-ray emitting regions. If we 
assume that all AGN have the same geometry, then it is only the 
properties of the torus which determine the observed $f(N_H)$ in the
AGN population as a whole. A typical zeroth order approach is to
speculate that this characteristic geometry is independent of the
luminosity of the central engine, and has not evolved over
cosmological timescales. However, alternative scenarios are postulated, for
example by Gilli et al. (2001) and Ueda et al. (2003), which
imply more complex forms for $f(N_H)$. The Gilli et al. (2001) model
suggests that some evolution of the average torus properties has
occurred over cosmological timescales, with more absorbed AGN at
high redshifts. The Ueda et al. (2003) model predicts that the
geometry, specifically the opening angle, of the obscuring torus is
determined by the luminosity of the central engine.

In this study, we compare the predictions of several different
forms of $f(N_H)$. A very simple description of $f(N_H)$ is a
continuous distribution, in which the number of AGN per unit $\log N_H$ is
proportional to $(\log N_H)^\beta$, over the range $19 < \log N_H < 25$.
A similar parameterisation was adopted in the synthesis model of
Gandhi & Fabian (2003), who found that setting $\beta$ = 2, 5 or 8 gave
acceptable fits to the XRB (it should be noted that the authors used
a separately evolving luminosity function for absorbed AGN). We
have tested three such $f(N_H)$, and refer to them as the $\beta = 2$, $\beta = 5$
and $\beta = 8$ models.
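
The power-law form above has an analytic cumulative distribution, so $\log N_H$ values can be drawn by inverse-transform sampling. The following is a minimal sketch under that assumption; the function names are hypothetical and this is not the simulation code used for the paper.

```python
import numpy as np


def sample_log_nh(beta, size, rng, lo=19.0, hi=25.0):
    """Draw log10(N_H) with density proportional to (log N_H)**beta on [lo, hi]."""
    k = beta + 1.0
    u = rng.random(size)
    # Invert CDF(x) = (x**k - lo**k) / (hi**k - lo**k).
    return (lo**k + u * (hi**k - lo**k)) ** (1.0 / k)


def fraction_above(beta, threshold, lo=19.0, hi=25.0):
    """Analytic fraction of AGN with log N_H above `threshold`."""
    k = beta + 1.0
    return (hi**k - threshold**k) / (hi**k - lo**k)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for beta in (2, 5, 8):
        draws = sample_log_nh(beta, 100_000, rng)
        print(beta, fraction_above(beta, 22.0), float(np.mean(draws > 22.0)))
```

Larger $\beta$ pushes a greater share of the population to high columns, which is the only way the three models differ.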

Model A of Gilli et al. (2001) combines the $f(N_H)$ observed in
the optically selected Seyfert-2 galaxies (Risaliti et al. 1999) with a
fixed ratio, R, of absorbed to unabsorbed AGN. We have tested two
similar $f(N_H)$ models here, where R is constant and set to 4 and 8,
and refer to these as the R = 4 and R = 8 models respectively. The
measured distribution of Risaliti et al. (1999) contains a number of
AGN where only the lower or upper limit on absorption is known.
So, for the purposes of our study, all those absorbed AGN having
$\log N_H < 22$ are evenly distributed in the $21 < \log N_H < 22$ interval,
and those with $\log N_H > 25$ are set to have $\log N_H = 25$. We also
compare the predictions of Model B of Gilli et al. (2001), which is
similar to Model A, above, but with R = 4 at z = 0 increasing to
R = 10 for z > 1.32; we refer to this as the R = R(z) model.

In addition, we test the luminosity dependent $f(N_H)$ function
of Ueda et al. (2003), in which high luminosity AGN are more
likely to have lower absorbing columns. We have converted from
our observed frame $L_{0.5-2}$ to rest-frame $L_{2-10}$ using the specific
spectral slope and redshift of each simulated AGN; we refer to
this as the $R = R(L_X)$ model. The $R = R(L_X)$ model distribution
does not include any AGN having absorbing columns outside
$20 < \log N_H < 24$. Finally, we employ a zero absorption scenario to
provide a base-line to the more realistic models; we call this the
R = 0 model. A subset of the model $f(N_H)$ are shown in fig. 1.

To reiterate, for all the tested models, we take only the $f(N_H)$
part from the published model, and always use the LDDE1 model
XLF of Miyaji et al. (2000) to describe both absorbed and
unabsorbed AGN.

Figure 1. The distribution of absorbing column densities for a subset of the
tested models. AGN with $\log N_H < 19$ are shown in the leftmost bin, and
those with $\log N_H > 25$ in the rightmost bin.

Recent X-ray spectral fitting analyses have found that after
absorption is considered, the mean AGN photon index $\Gamma$ is ~1.9
(Piconcelli et al. 2003; Page et al. 2003; Mateos et al. 2005). The
absorbed and unabsorbed AGN show similar mean slopes, and
there is no significant evolution seen even up to z = 5. However,
there is still an intrinsic scatter of slopes about the mean, and this
will have some effect on the observed colours and/or detectability
of sources. Therefore, we have used a Gaussian distribution of
slopes $g(\Gamma)$, to represent the spectra of the simulated AGN, with
$\langle\Gamma\rangle = 1.9$ and $\sigma_\Gamma = 0.2$. We have not considered sources with slopes
outside the range $1.2 < \Gamma < 2.6$. Piconcelli et al. (2003) found no
apparent dependence of de-absorbed $\Gamma$ on z, $N_H$, or flux, therefore
we assume that $g(\Gamma)$ is independent of the other AGN spectral
parameters.
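
One way to realise $g(\Gamma)$ numerically is to draw from the Gaussian and redraw any value falling outside the allowed range. The sketch below assumes this rejection scheme, which the text does not specify; the function name is hypothetical.

```python
import numpy as np


def sample_photon_index(size, rng, mean=1.9, sigma=0.2, lo=1.2, hi=2.6):
    """Draw photon indices from a Gaussian, redrawing values outside [lo, hi]."""
    gamma = rng.normal(mean, sigma, size)
    bad = (gamma < lo) | (gamma > hi)
    while bad.any():
        # Redraw only the rejected entries (assumed truncation scheme).
        gamma[bad] = rng.normal(mean, sigma, int(bad.sum()))
        bad = (gamma < lo) | (gamma > hi)
    return gamma


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    slopes = sample_photon_index(10_000, rng)
    print(slopes.min(), slopes.max(), slopes.mean())
```

Because the limits sit 3.5 standard deviations either side of the mean, the truncation removes only a very small fraction of draws and leaves the mean essentially unchanged.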



3 OBSERVATIONS 

The XMM-Newton data consist of three observations of the
$13^H$ field totalling 200 ks, of which ~120 ks is unaffected by soft
proton flaring. This field was the location of one of the deepest
ROSAT surveys (McHardy et al. 1998), due to its unusually low
Galactic absorbing column ($N_H \sim 8 \times 10^{19}$ cm$^{-2}$). In addition, the
$13^H$ field has been the subject of a host of multi-wavelength
observations, including a mosaic of four 30 ks Chandra pointings
covering the XMM-Newton field of view (McHardy et al. 2003), and
extensive, very deep radio mapping with the VLA (Seymour et al.
2004).

The data from the European Photon Imaging Cameras (EPIC)
were reduced with version 6.0 of the Science Analysis Software
(SAS) task-set to produce images in four energy bands (0.2-0.5,
0.5-2, 2-5, and 5-10 keV). Our source detection process uses the
standard SAS tasks EBOXDETECT and EMLDETECT together
with a custom background fitting task. We perform the source
detection routines on the combined data from all three (MOS1, MOS2
and pn) EPIC detectors. We used simulations to determine
detection likelihood limits such that we expect only 3% of the final
sourcelist to be spurious detections. Using these likelihood limits
we detected 225 sources. The approximate limiting fluxes (in units
of $10^{-15}$ erg s$^{-1}$ cm$^{-2}$) are 0.5 (0.2-0.5 keV), 0.5 (0.5-2 keV),
1.2 (2-5 keV), and 5 (5-10 keV). In the 2-5 keV band our sample
reaches a factor ~10 fainter than the knee of the source counts,
where the contribution to the XRB, per unit log flux, is greatest.
A full description of the data reduction, source searching, and
detection threshold determination processes is given in Loaring et al.
(2005).

The purpose of this study is to test the predictions of a number
of model AGN populations against AGN in the $13^H$ field. However,
we do expect to find a small number of non-AGN sources
in the full X-ray sample, and these could affect our comparisons.
Our ongoing optical spectroscopic follow-up program has identified
counterparts to over 100 of the XMM-Newton sources. In
particular, the brightest (R < 22) optical counterparts are 92% (81/88)
identified. Four of the sources, including the brightest source in
the field, are associated with foreground stars, and therefore are
not included in this study. We do not expect many of the remaining
optically faint, R > 22, counterparts to be identified with stars.
EMLDETECT finds four X-ray sources with high likelihood of being
extended, and with FWHM > 16". Of these four sources, three
were identified as clusters by Jones et al. (2002) in an analysis of
the ROSAT imaging of the $13^H$ field. Jones et al. (2002) found two
additional clusters in the $13^H$ field; however, these are both located
at the very edge of the EPIC field of view (where the vignetting
is most pronounced), and are not detected as extended sources by
EMLDETECT. Our AGN population models do not include clusters
of galaxies, so we discount the four extended sources from our
analysis. A small number of fainter clusters are expected to remain
in the sample, but not be flagged as extended by EMLDETECT.
We estimate this number by extrapolating the $N(>S_{0.5-2})$ plot of
Jones et al. (2002) to lower fluxes, whilst incorporating the effective
area determination of our survey (Loaring et al. 2005). Assuming
that the flux limit for detecting extended sources is twice that
for point sources, and using a $N(>S_{0.5-2})$ slope of 0.5, we predict
that approximately five additional clusters will remain in the
sample.

After the stars and obvious clusters are removed, the resulting
XMM-Newton sample contains a total of 217 sources (225 detections
less four stars and four extended sources), of which the
vast majority are likely to be AGN.



4 SIMULATION METHOD 

We have devised a Monte Carlo simulation technique which allows
direct comparison of the pattern of X-ray colours produced by
AGN absorption models, with the pattern seen in the $13^H$ sample.
For this, we have extended the XMM-Newton imaging simulation
method of Loaring et al. (2005) to a multi-band approach. We
model the EPIC point spread function, vignetting and diffuse
background in the same way as before. This method accounts for
observational biases and the complex selection function at work in the
sample. Each iteration of the simulation method consists of four
steps. i) We generate an input source population, with each member
having a set of randomly distributed parameters $L_X$, z, $N_H$ and $\Gamma$,
from the XLF, $f(N_H)$, and $g(\Gamma)$ models. ii) Multi-band count rates
are calculated for each simulated source according to $L_X$, z, $N_H$, $\Gamma$,
and the chosen spectral model. iii) Random source positions are
assigned, and the sourcelist is folded through a model of the EPIC
imaging response to create multi-band images for each EPIC
detector. iv) A source-detection chain is carried out on the resulting
multi-band images to create an output sourcelist. We repeat these
steps for 100 simulated fields for each $f(N_H)$ model, and for two
AGN spectral models. Simulated images are produced separately
for the MOS1, MOS2 and pn cameras, then combined to produce a
single image in each of the four energy bands for source searching
purposes.

4.1 Modelling the AGN population 

To generate the catalogue for the $13^H$ field, Loaring et al. (2005)
used model N(>S) curves to represent the AGN population
independently in each of four energy bands. While valid for
monochromatic studies, this technique is not suitable for colour analyses,
since it takes no account of the multi-band properties of individual
sources. We assume that there exists a single intrinsic XLF which
describes all AGN, which is modified by some distribution of
absorption to produce the observed XRB, source counts and source
colours.

Of the various models for the soft XLF (e.g. Boyle et al.
1994; Page et al. 1997; Jones et al. 1997; Miyaji et al. 2000), we
have chosen to use the Luminosity Dependent Density Evolution
(LDDE1) XLF model of Miyaji et al. (2000). This was primarily
because it is based on a large sample of AGN, and its model
parameters have been determined for the currently preferred
lambda-dominated cosmology. The sample used to fit this XLF model
contains a mixture of AGN both with, and without, broad lines,
suggesting that it contains a subset of absorbed AGN. We adopt the
best fitting parameter values presented in Table 3 of Miyaji et al.
(2000), and where appropriate, have corrected for the $H_0 = 70$ km
s$^{-1}$ Mpc$^{-1}$ used in this study. We integrate the XLF over the range
$41 < \log L_{0.5-2} < 48$, $0.015 < z < 5$ to calculate the total number
of AGN expected in the field. A 1D cumulative probability
distribution is generated by integrating the 2D XLF via an arbitrary path
in (z, $L_{0.5-2}$) space. It is then possible to build a list of AGN which
are randomly distributed in z and $L_{0.5-2}$, according to the model XLF.
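
The 1D cumulative distribution described above can be sketched by evaluating the 2D density on a grid, flattening it along a fixed path, and inverting the running sum. In the sketch below `toy_density` is a hypothetical stand-in with no physical meaning (it is not the LDDE1 form of Miyaji et al. 2000), and the grid sizes are arbitrary assumptions.

```python
import numpy as np


def toy_density(z, log_lx):
    """Hypothetical dN/dz/dlogL used only for illustration; NOT the LDDE1 XLF."""
    return (1.0 + z) ** 2 * np.exp(-z) * 10.0 ** (-0.8 * (log_lx - 41.0))


def sample_z_lx(n, rng, nz=200, nl=140):
    """Draw n (z, log L) pairs from the gridded density via a flattened 1D CDF."""
    z_edges = np.linspace(0.015, 5.0, nz + 1)
    l_edges = np.linspace(41.0, 48.0, nl + 1)
    z_mid = 0.5 * (z_edges[:-1] + z_edges[1:])
    l_mid = 0.5 * (l_edges[:-1] + l_edges[1:])
    zz, ll = np.meshgrid(z_mid, l_mid, indexing="ij")
    # Cells are equal-sized, so the density is proportional to the cell weight.
    weight = toy_density(zz, ll).ravel()  # row-major path through the grid
    cdf = np.cumsum(weight)
    cdf /= cdf[-1]
    cell = np.searchsorted(cdf, rng.random(n), side="right")
    cell = np.minimum(cell, cdf.size - 1)  # guard against rounding at the top end
    iz, il = np.unravel_index(cell, (nz, nl))
    # Spread each draw uniformly within its cell rather than at the cell centre.
    z = z_edges[iz] + rng.random(n) * (z_edges[iz + 1] - z_edges[iz])
    log_lx = l_edges[il] + rng.random(n) * (l_edges[il + 1] - l_edges[il])
    return z, log_lx


if __name__ == "__main__":
    z, log_lx = sample_z_lx(5, np.random.default_rng(0))
    print(z, log_lx)
```

The path through the grid is arbitrary, as the text notes: any ordering of the cells gives the same joint distribution once the cumulative sum is inverted.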

Each of these AGN is assigned a random value of $N_H$ according
to the $f(N_H)$ model being tested, and a spectral slope taken from
$g(\Gamma)$. The absolute normalisation of the XLF is iteratively adjusted,
so that the simulated fields contain the same source counts as the
$13^H$ sample at $S_{0.5-2} = 2 \times 10^{-15}$ erg s$^{-1}$ cm$^{-2}$.

4.2 X-ray colours from AGN spectral templates 

We determine the X-ray colours of the simulated AGN by using
a simple absorbed power-law (APL) model, which also includes
a correction for the small Galactic absorbing column ($N_H \sim
8 \times 10^{19}$ cm$^{-2}$) found in the $13^H$ field. In order to compare
simulated images with the observations, we must convert from the
simulated AGN parameters to multi-band EPIC count rates. We use the
spectral fitting package XSPEC to generate fake spectra,
incorporating both the instrumental response (for the MOS1, MOS2 and
pn cameras), and the AGN parameters z, $N_H$ and $\Gamma$. These spectra
are summed over the appropriate energy bands, to derive the
relevant conversion factors. The cost in processing time would be
prohibitive if we were to individually recalculate these conversion
factors for each of the thousands of simulated AGN. Hence, we
have built lookup tables of conversion ratios, which finely sample
(z, $N_H$, $\Gamma$) parameter space, covering the range 0.01 < z < 5,
$19 < \log N_H < 25$, $1.2 < \Gamma < 2.6$. The conversion ratios are calculated for
a single luminosity, but then scaled according to the luminosity of
each simulated AGN. These tables are used to convert rapidly from
any set of simulated AGN parameters ($L_X$, z, $N_H$, $\Gamma$) to count rates,
for each EPIC detector, and energy band.
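
The lookup-and-scale step can be sketched as follows. The class name, the nearest-grid-point lookup, the grid spacings and the table contents are all assumptions made for illustration: a real table would hold count rates derived from XSPEC fake spectra, whereas the numbers below are placeholders with no physical meaning.

```python
import numpy as np


class ConversionTable:
    """Band count rates on a (z, log N_H, Gamma) grid for one reference luminosity."""

    def __init__(self, z_grid, lognh_grid, gamma_grid, rates, ref_lx):
        self.grids = [np.asarray(g, dtype=float)
                      for g in (z_grid, lognh_grid, gamma_grid)]
        self.rates = np.asarray(rates, dtype=float)  # (nz, nnh, ngamma, nband)
        self.ref_lx = float(ref_lx)

    def count_rates(self, lx, z, lognh, gamma):
        """Nearest-grid-point lookup, scaled linearly with luminosity."""
        idx = tuple(int(np.abs(g - v).argmin())
                    for g, v in zip(self.grids, (z, lognh, gamma)))
        return self.rates[idx] * (lx / self.ref_lx)


# Placeholder table: values fall with redshift but are otherwise arbitrary.
z_grid = np.linspace(0.01, 5.0, 50)
lognh_grid = np.linspace(19.0, 25.0, 61)
gamma_grid = np.linspace(1.2, 2.6, 15)
shape = (z_grid.size, lognh_grid.size, gamma_grid.size, 4)
toy_rates = 1e-3 / (1.0 + z_grid)[:, None, None, None] ** 2 * np.ones(shape)
table = ConversionTable(z_grid, lognh_grid, gamma_grid, toy_rates, ref_lx=1e44)

if __name__ == "__main__":
    print(table.count_rates(5e44, 1.0, 22.0, 1.9))
```

Since the spectral shape depends only on (z, $N_H$, $\Gamma$), the luminosity enters purely as a multiplicative factor, which is what makes a single-luminosity table sufficient.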

We have also examined the effect of including a small reflection
component in the spectral model. This has the net effect of
hardening the spectrum at higher energies, making simulated AGN
slightly more detectable above 5 keV. We use the PEXRAV model
of Magdziarz & Zdziarski (1995), with the reflecting material
covering $\pi$ steradians, a viewing angle of 30 deg, and solar abundances.
We call this the APL+R spectral model.

It is beyond the scope of this study to include more complex
AGN spectral features, such as Fe K lines, or scattered soft X-ray
emission. We expect the effect of these features on AGN colours to
be small relative to the effects of continuum obscuration. However,
some of our results suggest that a number of detected sources in the
$13^H$ field have an additional soft component, as discussed later. We
expect to be able to detect very few (if any) very heavily absorbed
AGN having $\log N_H > 25$, and so have not included such objects in
the simulated populations. In fact, the simulations show us that we
expect AGN having $\log N_H > 24$ to account for only ~1% of the
detections in the $13^H$ sample. Therefore, any additional attenuation
due to Compton scattering within the dusty torus is ignored, since
it has little effect for AGN with $\log N_H < 24$.

4.3 Imaging characteristics 

The simulation method incorporates the effects of the EPIC re- 
sponse function, effective area, point spread function (PSF), vi- 
gnetting, and background to produce multi-band images. We use 
the energy and off-axis angle dependent "MEDIUM" accuracy PSF 
model, taken from the XMM-Newton calibration library. This PSF 
model has been measured to be accurate to better than ~3%
at 1.5 keV (Gondoin 2000). The effective exposure time and
vignetting are calculated from the SAS generated exposure maps for
the $13^H$ field. A synthetic background is added to the simulated
images to reproduce the level seen in the observations. The
correct level of this additional background was determined through 
an iterative process to account for the contribution from the faint 
unresolved simulated sources. 



4.4 Source detection process applied to the simulated images 

We use the simultaneous, multi-band source detection process on
the combined simulated MOS1+MOS2+pn images in the same
fashion as described for the $13^H$ data (Loaring et al. 2005). However,
only one iteration of the background determination process
is carried out, in order to conserve computation time. We have
searched for sources over the entire useful field of view of the
combined EPIC detectors, giving a total sky area of 0.185 deg$^2$.



5 CAPABILITIES OF THE 13 H SURVEY 

The inherent capabilities and limitations of the 13 H survey data can 
be precisely evaluated using our simulation method. In this section 
we refer to sources found in the simulated images by EMLDETECT 
as "output" sources. Each output source must be associated with an 
input source, in order that the colours of the output sources can be 
related to the input parameters (z, L X, N H, and Γ). For the majority 
of output sources, there is a single nearby input candidate, which we 
consider to be the progenitor. However, in some cases, an output 
source can have several nearby input candidate sources. This problem 
is exacerbated at high off-axis angles because of the degradation of 
the PSF (and consequently, of the precision of positions reported by 
EMLDETECT). Therefore, we have employed a simple algorithm that 
matches each output source to the brightest input source within a 
small radius, d, of the detected position. We make d dependent on 
off-axis angle by setting it to 5", 8" and 10" for off-axis angles <9', 
9'-12', and >12' respectively. Any output sources with no input 
candidates within d are almost certainly caused by Poissonian 
background fluctuations, and so are not considered in this section. 
However, we expect a small number of the detections in the 13 H 
sample to be caused by this phenomenon, and hence unrelated to any 
real X-ray source. Therefore, in section 6 we do include those output 
Figure 3. The fraction of simulated input sources that are matched with an 
output source (P det), as a function of N H, normalised to the output/input 
fraction for logN H < 20 sources. Results are shown for the APL spectral 
model (solid line), and the APL+R model (dashed line). The plot is compiled 
from the simulations for the β = 8 f(N H) model. 



Figure 2. The 50% detection limits for simulated sources as a function of 
z and L 0.5-2 for the absorbed power-law spectral model. The results for 
four ranges of absorption are shown: logN H < 21.5 (solid), 21.5 < 
logN H < 22.5 (long dash), 22.5 < logN H < 23.5 (short dash), and 23.5 < 
logN H < 25.0 (dotted). 



simulated sources having no input candidates when comparing the 
simulated AGN populations to the sample. 
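The association step described above can be illustrated with a minimal Python sketch. It is not the code used for the simulations: the flat-sky offsets in arcsec, the dictionary keys, the function names, and the treatment of sources lying exactly at 9' or 12' are all assumptions made for the example.

```python
import math


def match_radius_arcsec(off_axis_arcmin):
    """Matching radius d: 5", 8" and 10" for off-axis angles <9', 9'-12', >12'.

    The handling of the exact boundaries (9' and 12') is an assumption.
    """
    if off_axis_arcmin < 9.0:
        return 5.0
    if off_axis_arcmin <= 12.0:
        return 8.0
    return 10.0


def match_output_to_input(output, inputs):
    """Return the brightest input source within d of the output position.

    output: dict with 'x', 'y' (arcsec offsets from the optical axis; hypothetical layout)
    inputs: list of dicts with 'x', 'y' (arcsec) and 'flux'
    Returns None when no input candidate lies within d.
    """
    off_axis_arcmin = math.hypot(output["x"], output["y"]) / 60.0
    d = match_radius_arcsec(off_axis_arcmin)
    candidates = [
        s for s in inputs
        if math.hypot(s["x"] - output["x"], s["y"] - output["y"]) <= d
    ]
    if not candidates:
        return None  # most likely a background fluctuation
    return max(candidates, key=lambda s: s["flux"])
```

Choosing the brightest candidate, rather than the nearest, follows the text; a nearest-neighbour match would behave differently in crowded regions far off-axis.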



5.1 Selection function 

To determine the selection function of the simulated AGN, we evaluate 
the fraction of matched output/input sources, as a function of 
the input parameters z, L X, N H, and Γ. Fig. 2 shows the 50% completeness 
limit, as a function of redshift and luminosity, for several 
different levels of absorption. The contours show the loci in 
z - L 0.5-2 parameter space, at which half of the input sources are detected. 
There is a clear reduction in detection probability for AGN 
having absorbing columns above 10^23.5 cm^-2, and this effect is more 
marked at low z. The plot shows that the 13 H survey is able to detect 
the majority of moderately luminous AGN (logL 0.5-2 > 44), 
with moderate absorption (21.5 < logN H < 22.5) up to z ~ 3.5. 

Fig. 3 shows the fraction, as a function of absorption, of all 
input sources that are matched to output sources in the simulated 
images. This highlights the small differences in detection probability 
between the two spectral models. The addition of a reflection 
component in the AGN spectra has a rather small effect on the 
detectability of simulated sources. 

The dependence of the selection function on Γ can be seen 
in fig. 4, which shows the fraction of simulated input sources with 
output counterparts, as a function of spectral slope. It can be seen 
that the spectral slope of an AGN has a small but measurable bearing 
on its probability of detection. A strong increase in detection 
probability is seen for very hard sources (Γ < 1.4); however, the 
inset histogram shows that very few of these objects are predicted 
by the g(Γ) model. We have used the 0.5-2 keV de-absorbed flux 
to normalise the model spectra, and so the hard-slope AGN have 
a relatively high count rate above 2 keV, and are more likely to 
be detected. This effect is larger for moderately to heavily absorbed 
sources, since they are primarily detected at these harder energies. 
Figure 4. The fraction of simulated input sources that are matched with 
an output source (P det), as a function of Γ, normalised to the output/input 
fraction for Γ = 1.9 sources. Results are shown for the APL spectral model 
(solid line), and the APL+R model (dashed line). The plot is compiled from 
the simulations for the β = 8 f(N H) model. The inset shows the number of 
output sources, per simulated field, as a function of spectral slope. 



The impact on the overall selection function is largest for the f(N H) 
models containing the largest fraction of absorbed sources, i.e. the 
R = 8 and R = R(z) models. 



5.2 Sensitivity to X-ray colours 

Constraints on f(N H) models can be made from analysis of X-ray 
colour (i.e. hardness ratio) distributions. For example, Perola et al. 
(2004) compared the N H of XMM-Newton sources determined 
from full spectral fits, with the N H estimated using a hardness 
ratio method (over the 0.5-10 keV range), and showed that they 
were in good agreement for logN H > 22. For this study we define 
the hardness ratios 
$HR1 = (R_{0.5-2} - R_{0.2-0.5})/(R_{0.5-2} + R_{0.2-0.5})$, 
$HR2 = (R_{2-5} - R_{0.5-2})/(R_{2-5} + R_{0.5-2})$ and 
$HR3 = (R_{5-10} - R_{2-5})/(R_{5-10} + R_{2-5})$, 
where R is the source count rate, corrected for vignetting, in the 
given energy band (in keV). The corresponding 
Figure 5. Colour-colour distributions of simulated sources for the APL spectral model, showing HR1 vs HR2 (upper row) and HR2 vs HR3 (lower row). The 
panels show the colours produced by simulated sources grouped into bins according to their intrinsic absorption (0 < logN H < 21.5, 21.5 < logN H < 22.5, 
22.5 < logN H < 23.5, and 23.5 < logN H < 25.0). The levels of the contours are set such that they include 50% (short dash), 75% (long dash), and 90% 
(solid line) of the sources. The contribution of each simulated source to the greyscale map was represented by an elliptical Gaussian centered on the measured 
position in colour-colour space, and having widths equal to the corresponding σ HR. We also show the locus of expected colours for an AGN with an APL 
spectrum, N H in the logarithmic center of the interval, Γ = 1.9, and with 0 < z < 5 (numbered points indicate z). 



measurement errors are denoted by σ HR1, σ HR2 and σ HR3 respectively. 
The count rates, hardness ratios, and errors are computed within 
EMLDETECT using the combined dataset from the MOS1, MOS2 
and pn cameras. If any hardness ratio measurement is undetermined 
(zero count rates in two energy bands), we set it to 0.0 ± 1.0. 
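The hardness ratio definitions, and the convention adopted for undetermined ratios, can be sketched in Python as follows. This is an illustration only: in the analysis the ratios and their errors are computed within EMLDETECT, and the function names and the returned (value, fixed error) layout are assumptions. The error on a well-determined ratio is not derived here, so it is returned as None.

```python
def hardness_ratio(rate_hard, rate_soft):
    """Return (hard - soft)/(hard + soft), or None if both count rates are zero."""
    total = rate_hard + rate_soft
    if total == 0.0:
        return None
    return (rate_hard - rate_soft) / total


def xray_colours(r_02_05, r_05_2, r_2_5, r_5_10):
    """Return [(HR1, err), (HR2, err), (HR3, err)] from four band count rates.

    Band order: 0.2-0.5, 0.5-2, 2-5 and 5-10 keV (vignetting corrected).
    An undetermined ratio is set to 0.0 with a fixed error of 1.0; for a
    determined ratio the error slot is None (not computed in this sketch).
    """
    band_pairs = [(r_05_2, r_02_05), (r_2_5, r_05_2), (r_5_10, r_2_5)]
    colours = []
    for hard, soft in band_pairs:
        hr = hardness_ratio(hard, soft)
        if hr is None:
            colours.append((0.0, 1.0))
        else:
            colours.append((hr, None))
    return colours
```

A source with no counts in either of the two hardest bands therefore lands at HR3 = 0.0 ± 1.0, in the middle of the allowed interval.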

The dependence of HR1, HR2 and HR3 on absorption is illustrated 
in fig. 5, which shows the measured colour-colour distributions 
of "output" simulated sources grouped into a number of N H 
bins. For each N H bin, we have over-plotted the "perfect" z-track in 
colour-colour space for an AGN with mid-bin absorption, Γ = 1.9, 
and 0 < z < 5. Both the width of the N H bins, and the range of Γ in 
the simulated sources, act to distribute sources about this track. The 
relative density of the distribution along the track is mostly determined 
by the evolution of the XLF, which peaks above z = 1.5. In 
addition, a significant amount of scatter is caused by measurement 
uncertainties within the source detection process, particularly for 
the faintest sources. 

We see from the left-most upper panel of fig. 5 that the 
colour distribution of output simulated sources with logN H < 21.5 
is compact, and approximately centered on (HR1, HR2) = (0.2, -0.5). 
The study of XMM-Newton sources in the Lockman Hole 
by Mainieri et al. (2002) found that the vast majority of AGN in 
this part of hardness ratio space had broad line optical counterparts 
and, at most, weak absorption (logN H < 21.5) in their X-ray 
spectra. In contrast, most of the identified AGN having hard 
X-ray colours had narrow line optical counterparts, although only a 
small fraction of the hard sources had optical identifications. Examination 
of the three upper right panels reveals that the moderately 
to heavily absorbed sources (logN H > 21.5) occupy a 
measurably different region of HR1, HR2 space compared to their 
less absorbed counterparts. In particular, HR1 is sensitive to absorption 
in the range 21.5 < logN H < 23.5, and HR2 to absorption 
above logN H = 22.5. In the study of Georgantopoulos et al. 
(2004), the hardness ratio between the 0.5-2 and 2-8 keV bands 
did not appear to separate the broad and narrow line AGN; however, 
relatively few of the harder AGN in this sample had spectroscopic 
identifications. Della Ceca et al. (2004) showed that the 
majority of AGN with broad line counterparts fall in the range 
-0.75 < HR2 < -0.35, consistent with the location of the low 
absorption AGN (logN H < 21.5) produced by our simulations. 



As we would expect, the majority of the simulated faint unabsorbed 
AGN do not have good measurements of HR3. These 
sources have noise dominated count rate measurements above 
2 keV, and hence have HR3 measurements randomly scattered in 
the interval [-1, 1]. Of the simulated AGN having logN H > 23.5, 
it is only the most luminous (L 0.5-2 > 10^44 erg s^-1) that are detectable 
in our survey, as shown in figs. 2 and 3. The bottom right 
hand panel of fig. 5 illustrates that HR3 is sensitive to absorption above 
logN H = 23.5 for all but the highest redshift AGN. Hard band 
X-ray count rates were well determined for sources in the bright 
sample of Caccianiga et al. (2004), and of the four objects with a 
higher count rate in the 4.5-7.5 keV band than in the 2-4.5 keV 
band, three were associated with narrow line optical counterparts, 
and one with a Seyfert 1.9 galaxy. 



Figure 6 shows the HR1 vs. HR2 (upper row) and HR2 vs. 
HR3 (lower row) distributions produced by three of the f(N H) models. 
The most immediately noticeable difference between the plots 
is the fraction of sources that appear to the right of HR1 = 0.6 for 
the various f(N H) models. 
Figure 6. Colour-colour distributions produced by different f(N H) models compared to that seen in the sample data. The panels show the 13 H data (top left), 
and then the results for three of the simulated f(N H) models (using the absorbed power-law spectral model). The levels of the contours are set such that they 
include 50% (short dash), 75% (long dash), and 90% (solid line) of the smoothed source distribution, and were generated in the same way as for fig. 5. 



6 RESULTS 

6.1 Colour distribution of the 13 H sample 

The two left-most panels of fig. 6 show the colour-colour distributions 
of the 13 H sample, with grey-scale and contours generated in 
the same way as for fig. 5. Fig. 7 shows the same contours, but with 
the individual data points overlaid. Figs. 6 and 7 show that there is 
a strong concentration of sources in the (HR1, HR2, HR3) = (0.4, -0.5, -0.5) 
region, slightly harder in HR1 than the nominal position of 
an unabsorbed AGN with logN H < 21, Γ = 1.9. A large number of 
sources have much harder values of HR1 and HR2 than the nominal 
unabsorbed position, indicating that strong absorption is present 
in a significant fraction of the population. However, the majority of 
the sources in the HR2 < 0, HR3 > -0.3 region are actually faint 
soft sources having large HR3 measurement uncertainties. The bimodality 
apparent either side of HR1 = 0.6 is probably due to the 
fast increase in HR1 over the range 21.5 < logN H < 22.5, which 
limits the number of sources in this region. A similarly sparse region 
occurs at HR2 ~ 0.25 and again, this is probably related to the 
fast increase in HR2 over the range 23 < logN H < 24. 

6.2 Reproducing the observed source counts 

We have compared the 0.5-2 keV band integral source counts, 
N(> S 0.5-2), measured in the 13 H sample with those produced by 
the simulated model distributions. We make no correction for sky 
coverage, since the 13 H sample and the simulated fields have an 
identical survey-depth/sky area relation. We find a large disparity 
between the 13 H and simulated fields, especially around the knee 
of the observed N(> S 0.5-2) at ~ 10^-14 erg s^-1 cm^-2, as can be seen 
in fig. 8. Each of the f(N H) models produced similar N(> S 0.5-2) 
curves, especially at faint fluxes, where the statistical errors are 
smaller. Thus we deduce that the data-model disparity is primarily 
caused by differences between the data and the model XLF (and/or 
its evolution). We discuss this later. The primary purpose of this 
study is to compare the f(N H) models, so it is important that we 
minimise the effect on the statistical analysis caused by the disparity 
between the data and XLF/evolution model. Therefore, we have 
examined the X-ray colour distribution of sources, rather than the 
distributions of their absolute fluxes. We expect the colour-colour 
distributions to be more sensitive to f(N H) than to the XLF, because 
a small change of the position of an absorbed AGN in the 
z-L X plane has a strong effect on its overall brightness, but only a 
small effect on its X-ray colours. For example, if the peak of AGN 
space density is actually at z = 1.3 (rather than at z = 1.6 as predicted 
by the XLF/evolution model), then the resulting change in 
hardness ratios for an AGN having an absorbed power-law spectrum 
and logN H = 22, at this peak redshift, would be ΔHR1(ΔHR2) 
= +0.07(+0.03). However, an increase of 0.5 dex in the absorption 
of the same AGN would result in ΔHR1(ΔHR2) = +0.23(+0.17). 
Therefore, a colour analysis of the f(N H) models is more strongly 
dependent on the tested f(N H) than on differences between the data 
and XLF/evolution model. 
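For reference, the integral source counts compared in this section can be formed as in the short Python sketch below. This is an illustration rather than the code used here: the function names are hypothetical, the division by the 0.185 deg^2 field area is an assumption about the normalisation, and the Euclidean normalisation simply multiplies N(> S) by S^1.5.

```python
def integral_counts(fluxes, thresholds, area_deg2=0.185):
    """N(>S) per square degree at each threshold S, with no sky-coverage correction."""
    return [sum(1 for f in fluxes if f > s) / area_deg2 for s in thresholds]


def euclidean_normalised(counts, thresholds):
    """Scale N(>S) by S**1.5 so that a Euclidean population plots as a flat line."""
    return [n * s ** 1.5 for n, s in zip(counts, thresholds)]
```

Because the same function is applied to the sample and to each simulated field, any residual incompleteness affects both sides of the comparison equally.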

6.3 Statistical comparison of colour distributions in the data 
and the models 

We have used the Kolmogorov-Smirnov (KS) test to determine 
how well the simulated data reproduce the X-ray colour distribution 
measured in the 13 H sample. The KS test has the advantage 
that it requires no rebinning of data, utilising the full information 
content of the data set. However, it does not take into account the 
relative errors on data points, meaning that low signal to noise 
measurements can, to some extent, "wash out" the signal from the 
more precise measurements. A three dimensional extension of the 
KS test (3D-KS), as devised by Fasano & Franceschini (1987), was 
used to compare the sample with the simulation results in the full 
(HR1,HR2,HR3) variable space. In order to examine more closely 
how the models reproduce the sample distribution, we have carried 
out one dimensional KS tests separately on HR1, HR2 and HR3. 
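The statistic underlying the one dimensional tests is the largest absolute difference between the two empirical cumulative distributions. A minimal pure-Python sketch is given below; the function name is hypothetical, and the conversion of the statistic to a probability is not included. Ties are handled explicitly, since many measured hardness ratios pile up at exactly -1, 0 or +1.

```python
def ks_two_sample_statistic(sample_a, sample_b):
    """Maximum absolute difference between the empirical CDFs of two samples."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    na, nb = len(a), len(b)
    i = j = 0
    d_max = 0.0
    while i < na and j < nb:
        x = min(a[i], b[j])
        # Step past every value equal to x in both samples, so that tied
        # values are compared only after both CDFs have been updated.
        while i < na and a[i] <= x:
            i += 1
        while j < nb and b[j] <= x:
            j += 1
        d_max = max(d_max, abs(i / na - j / nb))
    return d_max
```

Once one sample is exhausted its CDF equals one, and the difference can only shrink, so the loop may stop at that point.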



Figure 7. X-ray colour-colour distributions found in the 13 H sample. The 
levels of the contours are set such that they include 50% (short dash), 75% 
(long dash), and 90% (solid line) of the smoothed source distribution, and 
were generated in the same way as for fig. 5. Typical sizes of σ HR1, σ HR2 
and σ HR3 are shown with boxes, for sources having "two-band" fluxes of 
10^-14.5 and 10^-14 erg s^-1 cm^-2, where the "two-band" flux is the flux 
measured over the two energy bands used to calculate the hardness ratio. 

The conversion from the 3D-KS test statistic, D 3D-KS, to the 
probability that two samples were taken from the same underlying 
population, P 3D-KS, is strongly dependent on the number of sources 
in the tested samples, and the degree of correlation between the 
tested variables. Fasano & Franceschini (1987) numerically generated 
lookup tables to allow this conversion at a number of confidence 
levels, for a range of sample sizes, and values of the correlation 
parameter. However, these tables give only a relatively small 
number of conversion values, at discrete confidence levels, sample 
sizes, and values of the correlation parameter. Therefore, we 
have run a set of simulations, the results of which permit conversion 
from D 3D-KS directly into P 3D-KS using the precise 
sample sizes and correlations seen in the 13 H sample. We calculated 
the three-dimensional probability density map (3D-PDM) of 
the 13 H sample in (HR1,HR2,HR3) space. The contribution from 
each source to the 3D-PDM is calculated from a 3D-Gaussian that 
has widths equivalent to σ HR1, σ HR2 and σ HR3. The normalisation 
of the 3D-Gaussian is set such that the total contribution of each 
source is unity. This 3D-PDM is used to generate pairs of random 
populations, having 217 and 25000 members respectively, for 
which D 3D-KS is calculated. The latter step is repeated for 100000 
iterations. The equivalent probability for any particular value of the 
3D-KS statistic is equal to the fraction of these iterations having 
D 3D-KS greater than this value. The absolute lower limit at which 
we can evaluate the probability is given by the reciprocal of the 
number of simulation iterations, i.e. 0.001%, although the errors 
are large at this level. This limit is determined by the processing 
time available. Table 1 shows P 3D-KS for the eight f(N H) models, 
and for both of the tested spectral models. In order to determine 
where the biggest differences arise between the data and models, 
we have calculated the KS probabilities (P KS) separately for each 
of HR1, HR2 and HR3, the results of which are also shown in table 1. 

Figure 8. The N(> S 0.5-2) curves measured in the 13 H sample (filled circles), 
compared to those produced by the β = 2 (solid line), β = 8 (long 
dash), R = 4 (short dash), R = R(z) (dotted), and R = R(L X) (dot-dash) 
f(N H) models. These results are for the absorbed power-law spectral model 
and are normalised to the Euclidean slope. The equivalent points for the 
sample of Miyaji et al. (2000) (taken from fig. 6) are also shown (open symbols 
with errorbars), and have been normalised assuming a sky area of 0.185 
deg^2. 
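The Monte Carlo conversion from D 3D-KS to P 3D-KS described in section 6.3 can be sketched as follows. This is a schematic Python illustration, not the code used for the analysis: the 3D-KS statistic itself is passed in as a user-supplied function (`statistic`, hypothetical), the 3D-PDM is represented directly as a mixture of axis-aligned Gaussians with one component per source, and no clipping of the drawn hardness ratios to [-1, 1] is applied (an assumption).

```python
import random


def draw_from_pdm(sources, n, rng):
    """Draw n points from a mixture of 3D Gaussians, one per source.

    sources: list of ((hr1, hr2, hr3), (sigma1, sigma2, sigma3)) tuples.
    Each source carries equal weight, mirroring the unit normalisation
    of its contribution to the 3D-PDM.
    """
    points = []
    for _ in range(n):
        centre, widths = rng.choice(sources)
        points.append(tuple(rng.gauss(c, w) for c, w in zip(centre, widths)))
    return points


def null_distribution(sources, statistic, n_small, n_large, n_iter, seed=0):
    """Statistic values for n_iter pairs of populations drawn from the same PDM."""
    rng = random.Random(seed)
    return [
        statistic(draw_from_pdm(sources, n_small, rng),
                  draw_from_pdm(sources, n_large, rng))
        for _ in range(n_iter)
    ]


def probability(d_observed, null_values):
    """Fraction of null iterations with a statistic greater than d_observed."""
    exceed = sum(1 for d in null_values if d > d_observed)
    return exceed / len(null_values)
```

With the values quoted in the text one would call `null_distribution(sources, statistic, 217, 25000, 100000)`; the smallest non-zero probability that can then be resolved is 1/100000, i.e. 0.001%.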



7 SELECTING ABSORBED SOURCES 

If we wish to examine just moderately to heavily absorbed AGN, then 
we need some X-ray colour selection criteria which will allow us to 
choose only this population. Examination of the simulation results 
(see fig. 5) shows that a cut of HR2 > -0.3 will select the majority 
of the most heavily absorbed sources (logN H > 22.5). This HR2 cut 
is the same as that shown to discriminate efficiently between optical 
type-1 and type-2 AGN in the XMM-Newton Bright Serendipitous 
Survey (XBSS) sample (Caccianiga et al. 2004; Della Ceca et al. 
2004). However, it should be cautioned that the XBSS definition 
of HR2 differs slightly from our own; they use the 2-4.5 keV 
energy band (rather than our 2-5 keV band), and report HR2 
only for the MOS2 dataset. AGN with absorption in the range 
Table 1. 3D-KS and KS test probabilities, calculated by comparing the distributions of HR1, HR2 and HR3 produced by the eight tested f(N H) models with 
that found in the 13 H sample. Results are shown for both the absorbed power-law (APL), and absorbed power-law with reflection (APL+R), spectral models. 

Figure 9. The fraction of output simulated sources which are selected by 
the hardness ratio cut HR1 - σ HR1 > 0.6 OR HR2 - σ HR2 > -0.3, as a 
function of N H. The solid line shows the result for the absorbed power-law 
spectral model, and the dashed line shows the result with the addition of a 
reflection component. 



21.5 < logN H < 22.5 are included by adding the region with 
HR1 > 0.6. To reduce the number of faint soft sources, having 
low signal-to-noise measurements, that are scattered into the "hard" 
sample, we require that a source satisfies the above conditions by 
more than 1σ to be included. Fig. 9 demonstrates the effectiveness 
of such a cut in selecting only those sources with significant N H. 
The slight dip in the selected fraction at logN H > 24 is caused 
by the generally larger errors on HR1/HR2 for the most heavily 
absorbed sources. This evaluation of the effectiveness of the selection 
scheme assumes a simple absorbed power-law AGN spectral 
model. Spectral features such as an additional soft component will 
serve to degrade this efficiency. Because of the relatively poor average 
measurement accuracy for HR3, we have not used it to select 
absorbed sources. These "hard" selection criteria, when applied to 
the 13 H sample, result in 86 hard sources (39% of the total). This 
value is consistent with the fraction (34 ± 9%) of optical type-2 
AGN identified in the 2-4.5 keV selected subset of the XBSS 
(Della Ceca et al. 2004). The "hard" fractions for each of the f(N H) 
models are presented in table 2. We see that the fractions of hard 
sources produced by the β = 8, R = 8, and R = R(z) models are 
consistent within 3σ with the fraction seen in the 13 H sample. By 
including a reflection component in the spectra, the hard fraction is 
increased by less than 2% for all but the R = R(L X) model. 
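The "hard" selection criterion amounts to a two-condition cut applied to each source, with each condition required to hold by more than 1σ. A minimal Python sketch follows; the function names and the tuple layout of the input are assumptions made for the example.

```python
def is_hard(hr1, sigma_hr1, hr2, sigma_hr2):
    """True if HR1 - sigma_HR1 > 0.6 OR HR2 - sigma_HR2 > -0.3."""
    return (hr1 - sigma_hr1 > 0.6) or (hr2 - sigma_hr2 > -0.3)


def hard_fraction(sources):
    """Fraction of sources passing the cut.

    sources: iterable of (hr1, sigma_hr1, hr2, sigma_hr2) tuples.
    """
    sources = list(sources)
    n_hard = sum(1 for s in sources if is_hard(*s))
    return n_hard / len(sources)
```

Subtracting the measurement error before comparing with the threshold is what prevents faint soft sources with poorly determined colours from scattering into the "hard" sample.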



8 DISCUSSION 

8.1 Reproducing the colours in the sample 

Table 1 shows that, when we consider an absorbed power-law 
spectral model, none of the f(N H) models provides a good description 
of the X-ray colours in the 13 H sample; all are strongly rejected 
by the 3D-KS test (with greater than 99% confidence). The 
β = 8 model provides the best fit with a probability of 0.8%; although 
low, this value includes the effects of the disparity between 
the data and the XLF/evolution model. However, the addition of a 
reflection component to the AGN spectra improves the P 3D-KS for 
almost all of the f(N H) models. The best fitting distribution is still 
the β = 8 model, but with a much improved probability of 6%. The 
remainder of the f(N H) models are strongly rejected by the 3D-KS 
test, with greater than 99.5% confidence. This match between the 
β = 8 model and the sample is actually rather a good one, considering 
that the only tuned parameter is the overall normalisation of 
the XLF. The large range of P 3D-KS that is measured between the 
f(N H) models demonstrates that our colour analysis technique is 
indeed a good probe of the underlying distribution of absorption in 
the sample. 

The results of KS tests on individual hardness ratios reveal 
more clearly where the f(N H) models succeed or fail to reproduce 
the sample colours. When HR3 is considered, the addition of a reflection 
component to the absorbed power-law spectral model improves 
the fit of all the tested f(N H) models. However, there is 
not such a consistent improvement in the fits for HR1 and HR2. 
The HR2 distributions produced by the β = 8, R = 4, R = 8 
and R = R(z) models closely match the distribution seen in the 
13 H sample, with KS probabilities greater than 30% when a reflection 
component is included in the model spectra. However, large 
differences between the models and the sample arise in the distributions 
of HR1 and HR3. The tested f(N H) models over-produce 
the fraction of sources having very hard colours below 2 keV 
(HR1 = 1) relative to the 13 H sample. In the 13 H sample, the fraction 
of sources having HR1 = 1 is around 8%; however, even in 
the best fitting β = 8 model, the fraction is around 15%. We find 
that almost all (90%) of the simulated sources having HR1 = 1 
have logN H > 22, and so have had virtually all their flux below 
0.5 keV removed by absorption. However, these sources are distributed 
similarly to the rest of the population in L X, z and Γ. The 
relatively low P KS (HR1) for the R = 4, R = 8 and R = R(z) models 
can be partly attributed to their low number of lightly absorbed 
AGN (20 < logN H < 22). Within this subset of models, the evolving 
R = R(z) model is strongly favoured over the R = 4 model, but 
is marginally less successful than the R = 8 model. However, the 
latter model is unphysical in that it contains a much greater ratio of 
absorbed to unabsorbed sources than is seen in the local universe. 
When the HR3 distribution is considered, we find the models produce 
too large a fraction of simulated sources having HR3 = -1. 
This is most probably caused by the over-abundance of very faint 
sources produced by the simulations, related to the XLF mismatch. 
These sources are detected just above the flux limit in the softer 
bands, but have count rates which fall below the background level 
in the hardest band, and hence are measured to have HR3 ~ -1. 

The statistical analysis strongly rejects the R = R(L X) model, 
in agreement with the findings of a recent study by Treister et al. 
(2004), which was based on deep multi-wavelength data in the 
GOODS fields. These authors tested the f(N H) model of Ueda et al. 
(2003), alongside a simpler f(N H), but found that the latter provided 
a much better description of the data. 

By examining the subset of sources satisfying the "hard" selection 
criteria, we can compare the distributions of absorption 
above logN H = 22 that are found in the 13 H sample with those predicted 
by the models. We have carried out 3D-KS and KS tests on 
HR1, HR2, and HR3, as before, but only for the "hard" selected 
subsets of the 13 H sample and simulations. The 3D-KS test rejects 
each of the f(N H) models with high confidence (both with 
and without a reflection component included in the model spectra). 
We have examined the individual KS test results to determine the 
source of this large disparity. We find that the KS probabilities for 
the best fitting β = 8 model (with the absorbed power-law spectral 
model) are 0.0003, 0.76, and 0.12, for HR1, HR2 and HR3 respectively. 
The equivalent probabilities when an additional reflection 
component is included in the model spectra are 0.0002, 0.77, and 
0.29. The KS test probabilities do not vary greatly between the different 
f(N H) models (excepting the R = model). The HR2 and 
HR3 distributions of all the f(N H) models (excepting the R = 
model) provide rather good matches to the HR2 and HR3 distributions 
found in the "hard" subset of the 13 H sample. The differences 
between the "hard" subsets of the f(N H) models are small, due to 
the rapid decline in the selected fraction of "input" sources for high 
absorbing columns (see fig. 3). This acts to diminish the importance 
of the differences between the f(N H) models above N H = 10^22 
cm^-2. The addition of a reflection component to the spectral model 
improves the KS probability for HR3 by a factor of ~ 2. 

We see that the mismatch between the HR1 distributions is 
much worse in the "hard" subset, compared to the sample as a 
whole. This appears to be due to the overproduction of simulated 
sources having HR1 = 1, which is more pronounced in the "hard" 
sub-sample. The fraction of the "hard" sample with HR1 = 1 is 
20% for the 13 H field, but ~ 40% for the model populations. The 
disparity could be explained if a number of the heavily absorbed 
AGN have an additional soft X-ray component in their spectra. 
In order to reproduce the distribution of HR1, this phenomenon 
should occur in around 10-20% of the heavily absorbed sources. A 
number of absorbed AGN with excess soft emission have been observed 
by other authors in samples of spectroscopically identified 
X-ray sources (e.g. Caccianiga et al. 2004; Page et al. 2005). This 
excess component could be due to intense starbursts in the host 
galaxy, or to diffuse emission surrounding an AGN embedded in 
a galaxy cluster. Alternatively, it could be scattered radiation from 
the central engine of the absorbed AGN. 

8.2 Implications for torus models 

For the simplest toy model of a torus with uniform density, and a 
typical opening angle, θ o, the fraction of AGN that are heavily absorbed 
is approximately cos(θ o). So, if we use the size of the "hard" 



Table 2. Fraction of "hard" sources, h, (satisfying HR1 - σ HR1 > 0.6 OR 
HR2 - σ HR2 > -0.3), produced by each simulated f(N H) model. The corresponding 
fraction seen in the 13 H sample is 0.39 (86/217). Results are 
shown for an absorbed power-law spectral model both with (APL+R) and without 
(APL) a reflection component. The standard deviation of the hard fraction, 
σ h, over the 100 simulation repetitions is also shown. 
fraction of the 13 H sample as a measure of the number of absorbed 
AGN, we can infer a rather wide opening angle of θ o ~ 67°. However, 
this estimate does not take into account the effect of the drop 
in the selection function toward high N H, and can only be seen as 
an upper limit on θ o. We estimate the relative selection function 
for hard sources by counting the fraction of simulated "hard" input 
sources that have output counterparts relative to that for all input 
sources. Applying this correction to the 13 H sample, we predict an 
intrinsic "hard" fraction of ~ 0.8, implying an opening angle of 
θ o ~ 37°. If in our correction for the relative selection function, we 
exclude those sources with absorbing column above logN H = 24, 
where our sample constrains the models only weakly, then we find 
θ o < 52°. We are also able to examine the range of torus parameters 
that would best match the f(N H) models. For the best fitting 
β = 8 model, where the fraction of input sources with logN H > 22 
is ~ 0.75, the predicted opening angle is θ o ~ 42°. 
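The opening angles quoted above follow from inverting the toy-model relation, absorbed fraction ≈ cos(θ o). A short Python check is given below; the function name is hypothetical, and the relation is only the uniform-density approximation stated in the text.

```python
import math


def opening_angle_deg(absorbed_fraction):
    """Opening angle (degrees) for which cos(theta_o) equals the absorbed fraction."""
    return math.degrees(math.acos(absorbed_fraction))


# Observed "hard" fraction of the sample and the selection-corrected fraction.
for fraction in (0.39, 0.8):
    print(fraction, round(opening_angle_deg(fraction)))
```

The two fractions return approximately 67° and 37°, matching the values in the text; because the corrected fraction is itself only approximate, the derived angles should be read to no better than a few degrees.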

As fig. 5 shows, HR1 and HR2 are sensitive to the shape of 
the distribution over a wide range of N H, particularly for intermediately 
absorbed sources. We have seen that the β = 8 model is strongly 
favoured over the R = 4 f(N H) model (see table 1). These two 
models are very similar in the range 22 < logN H < 24, contain 
similar numbers of unabsorbed AGN (logN H < 21), and produce 
comparable numbers of "hard" sources. Therefore, the difference 
must lie primarily in the 21 < logN H < 22 range, in which the 
β = 8 model contains many more AGN. A major problem with the 
uniformly dense torus model is that it predicts that nearly all AGN 
will be either heavily absorbed or completely unabsorbed. However, 
more complex models, incorporating a wide distribution of 
torus densities, predict larger numbers of intermediately absorbed 
AGN. For example, a model in which the density falls off exponentially 
with angle away from the plane of the torus predicts a much 
flatter f(N H) (e.g. Treister et al. 2004). It is possible, with some 
tuning of such a model's parameters, to approximately match the 
best fitting β = 8 distribution. 

Since absorption in the 21 < logN H < 22 range has a significant 
effect only on HR1, it would not have been detectable 
in the colour distributions if the 0.2-0.5 keV band had not 
been considered. A number of studies of absorption in faint AGN 
have based their estimates of N H on hardness ratios between the 
0.5-2 and 2-10 keV bands, and therefore may have underestimated 
the number of intermediately absorbed AGN (e.g. 
Ueda et al. 2003; Treister et al. 2004). 

A better determination of mean torus properties will be possible 
when the 13 H field is covered by Spitzer, and we are able to 
correlate X-ray colours with mid/far-IR data. 



8.3 Source count disparity 

Each of the simulated f(N_H) models produced similar 0.5-2.0 keV source count-flux
relations, N(> S_0.5-2). However, these are seen to reproduce poorly the
N(> S_0.5-2) relation observed in the 13 H sample (see fig. 8). The models
under-produce the N(> S_0.5-2) above the normalisation flux
(2 x 10^-15 erg s^-1 cm^-2), and over-produce the N(> S_0.5-2) below this flux (see
fig. 8). In fact, at 10^-14 erg s^-1 cm^-2 the models under-produce the source
counts seen in the 13 H sample by a factor of about two. This disparity is seen to
a similar degree in each of the f(N_H) models, suggesting that it is related to the
difference between the data and the XLF/evolution model. The N(> S_0.5-2) of the
Miyaji et al. (2000) sample is also shown in fig. 8, plotted assuming our field has
a uniform sky area of 0.185 deg^2. This illustrates that in the flux range
10^-14 - 10^-13 erg s^-1 cm^-2, the LDDE1 XLF also under-produces the source counts
of the sample from which it was derived. The shape of the N(> S_0.5-2) relation of
the Miyaji et al. (2000) sample is closer to that seen in the 13 H sample than to
the models. The faintest AGN in the Miyaji et al. (2000) sample are from the
deepest ROSAT observations of the Lockman Hole field, where the flux limit of the
data was ~ 2 x 10^-15 erg s^-1 cm^-2. Our significantly deeper flux limit means that
we are using part of L - z space outside that constrained by the sample of
Miyaji et al. (2000). A previous comparison of source counts from ROSAT
observations in the Lockman Hole and 13 H fields revealed a ~ 10 - 20%
over-abundance near S_0.5-2 = 10^-14 erg s^-1 cm^-2 in the 13 H field with respect
to the Lockman Hole (McHardy et al. 1998). In addition, Loaring et al. (2005) found
that the 13 H field is slightly over-dense in the 0.5-2 keV band with respect to
both of the Chandra deep fields. Therefore we conclude that the differences between
model and sample are caused by a combination of these factors. In particular, our
extrapolation of the LDDE1 XLF/evolution model to faint fluxes suggests that this
complex scheme requires some revision. 
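
The comparison above rests on the cumulative source count-flux relation. As a minimal sketch of how N(> S) per unit sky area can be built from a list of source fluxes, assuming a uniform sky area as in the text, the following may be useful. The function name and the toy flux values are hypothetical and are not taken from the 13 H catalogue; only the 0.185 deg^2 area comes from the text.

```python
from bisect import bisect_right

def cumulative_counts(fluxes, flux_grid, area_deg2):
    """Return N(>S) per square degree for each flux S in flux_grid.

    fluxes    : source fluxes (erg s^-1 cm^-2)
    flux_grid : fluxes S at which to evaluate N(>S)
    area_deg2 : sky area, assumed uniform over the whole flux range
    """
    ordered = sorted(fluxes)
    counts = []
    for s in flux_grid:
        # bisect_right gives the number of sources with flux <= s,
        # so the remainder are strictly brighter than s.
        n_brighter = len(ordered) - bisect_right(ordered, s)
        counts.append(n_brighter / area_deg2)
    return counts

# Toy fluxes, invented purely for illustration.
toy_fluxes = [1e-15, 3e-15, 1e-14, 5e-14]
grid = [2e-15, 1e-14]
counts = cumulative_counts(toy_fluxes, grid, area_deg2=0.185)
for s, n in zip(grid, counts):
    print(f"N(>{s:.0e}) = {n:.1f} deg^-2")
```

Applying the same function to a simulated source list and to the observed sample, and taking the ratio at each flux, would give the kind of factor-of-two comparison quoted above. A real analysis would also need the flux-dependent sky coverage, which this sketch ignores.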



8.4 High-z AGN in the 13 H sample 

The shape of the XLF at high redshift is poorly known because of the difficulties
in obtaining a large spectroscopically identified sample of these objects. We can
use the simulated source population to make predictions about the number of high-z
AGN in the 13 H sample. Each of the f(N_H) models predicts that around 16% of the
total number of X-ray detections are due to AGN with 3 < z < 5. Therefore, it can
be inferred that the fraction of AGN with z > 3 in the X-ray population is
primarily dependent on the shape of the underlying XLF and its evolution, rather
than the N_H distribution within the high-z population. The model predictions
suggest that we should expect around 35 high-z AGN in the 13 H field. However, only
a single AGN has been identified as having z > 3 by our follow up optical
spectroscopy program (which has secure IDs for over 100 sources). This disparity
may be due to the over-production of faint sources by the XLF/evolution model;
these are more likely to be at high z. The X-ray detection probability of the
z > 3 AGN is much less dependent on N_H than for the low-z AGN, since most
absorption is redshifted below 2 keV. Therefore, most of the f(N_H) models predict
that absorbed AGN make up the majority of the detected high-z population, the
precise fraction being dependent on the particular f(N_H) model. However, the
absorption of optical and UV spectral features does severely affect the probability
of identification for these objects. We have recently obtained further deep optical
imaging of the 13 H field in several bands, which will permit us to make
photometric redshift estimates for some of the optically faint sources. The
forthcoming deep coverage of the 13 H field in the infrared with Spitzer will
further constrain the nature of the high-z population. 
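
To give a feel for the size of this disparity, the sketch below computes the binomial probability of finding at most one z > 3 AGN among the identified sources if the predicted 16% fraction applied to them directly. It assumes exactly 100 identified sources (the text says only "over 100") and that these are an unbiased random subset of the detections. The second assumption is known to be false, since absorbed high-z AGN are harder to identify optically, so the number is illustrative only. The function name is hypothetical.

```python
from math import comb

def prob_at_most(k_max, n, p):
    """Binomial probability of k_max or fewer successes in n trials."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(k_max + 1))

# Assumed: 100 identified sources, each with a 16% chance of lying at z > 3.
p_value = prob_at_most(1, 100, 0.16)
print(f"P(<= 1 high-z AGN in 100 IDs) = {p_value:.1e}")
```

The probability is very small under these assumptions, which is why the text appeals to over-production of faint sources by the XLF/evolution model and to identification bias, rather than to chance.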



9 SUMMARY 

We have demonstrated how a colour-based analysis of deep XMM-Newton data can be
used to constrain models of absorption in the AGN population without requiring
complete optical spectroscopic follow up. By using a detailed simulation technique,
we have been able to take account of the complex selection function at work in the
sample, and demonstrate how this modulates the input population. We have shown that
a simple f(N_H) model together with an absorbed power-law spectral model (including
a reflection component), reproduces the observed HR1/HR2/HR3 colour distribution
with probability 6%. All of the other model N_H distributions that we compared were
rejected at greater than 99.5% probability. In particular, two more complex f(N_H)
models are strongly rejected by the 3D-KS test; the redshift dependent R = R(z)
model produces too many hard sources, and the luminosity dependent R = R(L_X) model
produces too few. In general, the addition of a reflection component to the
absorbed power-law spectral model improved the match between the colour
distributions of the models and the sample. The reflection component serves to
harden the spectral slope at higher energies, and its effect was most evident in
the HR2 and HR3 distributions. We have shown that there is a large disparity
between the shape of the N(> S_0.5-2) produced by the models and that found in our
sample. We suggest that this is for the most part due to differences between the
actual L - z distribution of sources in the 13 H field, and the XLF/evolution model
that we have used. These XLF/evolution differences will have had some effect on the
colour distributions produced by the f(N_H) models, and could explain the surfeit
of HR3 = -1 sources produced by all of the f(N_H) models. We have seen some
evidence that suggests that the spectra of a significant fraction of absorbed
sources in the 13 H sample have an additional soft X-ray component. This feature
was not included in our spectral models, and therefore contributed a large part of
the disparity between the HR1 distributions of models and sample. Considering this
factor, together with the XLF/evolution differences, we conclude that the 6%
probability for the β = 8 model shows that it provides a rather good fit to the
data. The shape of the β = 8 distribution can be broadly reproduced using a toy
model for the torus in which the density falls away rapidly for viewing angles away
from the plane of the torus. We have shown that AGN having log N_H > 22 can be
efficiently selected by choosing sources in the regions HR1 - σ_HR1 > 0.6 and
HR2 - σ_HR2 > -0.3. We intend to extend the methods described here to further
XMM-Newton deep fields in order to increase the sample size, and to reach to
fainter X-ray fluxes. 
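
The colour selection quoted above can be sketched as a simple cut on error-subtracted hardness ratios. The catalogue layout (dictionaries with keys `hr1`, `hr1_err`, `hr2`, `hr2_err`), the function name and the toy values are all hypothetical. The two regions are reported separately because the text does not state whether a source must satisfy one condition or both.

```python
def absorbed_candidate_flags(sources, hr1_cut=0.6, hr2_cut=-0.3):
    """Flag sources lying in each of the two colour-selection regions.

    Returns a list of (in_hr1_region, in_hr2_region) tuples, where a
    region is defined by HR - sigma_HR exceeding the corresponding cut.
    """
    flags = []
    for src in sources:
        in_hr1_region = src["hr1"] - src["hr1_err"] > hr1_cut
        in_hr2_region = src["hr2"] - src["hr2_err"] > hr2_cut
        flags.append((in_hr1_region, in_hr2_region))
    return flags

# Toy catalogue, invented purely for illustration.
toy_sources = [
    {"hr1": 0.9, "hr1_err": 0.1, "hr2": -0.5, "hr2_err": 0.1},
    {"hr1": 0.2, "hr1_err": 0.1, "hr2": 0.0, "hr2_err": 0.1},
    {"hr1": 0.65, "hr1_err": 0.1, "hr2": -0.25, "hr2_err": 0.1},
]
flags = absorbed_candidate_flags(toy_sources)
print(flags)
```

The third toy source shows why the error term matters: its measured colours exceed both cuts, but it is not selected in either region once the 1σ uncertainty is subtracted.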
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